Analytics on GeoDataFrames - COVID-19 Case

In this case study we focus on positive cases among the population most vulnerable to COVID-19: middle-aged adults (40-59 years) and older adults (60+ years).

First, we read the data stored in Google Drive.

In [297]:
import pandas as pd

# Read the file, specifying ";" as the delimiter
covid19 = pd.read_csv("positivos_covid.csv", delimiter=';')

covid19.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4585360 entries, 0 to 4585359
Data columns (total 10 columns):
 #   Column           Dtype  
---  ------           -----  
 0   FECHA_CORTE      int64  
 1   DEPARTAMENTO     object 
 2   PROVINCIA        object 
 3   DISTRITO         object 
 4   METODODX         object 
 5   EDAD             float64
 6   SEXO             object 
 7   FECHA_RESULTADO  float64
 8   UBIGEO           float64
 9   id_persona       float64
dtypes: float64(4), int64(1), object(5)
memory usage: 349.8+ MB
In [299]:
#check
covid19.head()
Out[299]:
FECHA_CORTE DEPARTAMENTO PROVINCIA DISTRITO METODODX EDAD SEXO FECHA_RESULTADO UBIGEO id_persona
0 20241203 TUMBES TUMBES TUMBES AG 46.0 FEMENINO 20221207.0 240101.0 203499.0
1 20241203 LIMA LIMA JESUS MARIA AG 69.0 FEMENINO 20230822.0 150113.0 221397.0
2 20241203 SAN MARTIN MOYOBAMBA MOYOBAMBA AG 55.0 FEMENINO 20240108.0 220101.0 295651.0
3 20241203 AREQUIPA CAYLLOMA COPORAQUE AG 50.0 MASCULINO 20230824.0 40506.0 851625.0
4 20241203 LIMA LIMA JESUS MARIA AG 58.0 MASCULINO 20221217.0 150113.0 287786.0

We begin data cleaning

In [301]:
covid19 = covid19.drop(columns=['FECHA_CORTE', 'METODODX', 'id_persona'])

#check
covid19.head()
Out[301]:
DEPARTAMENTO PROVINCIA DISTRITO EDAD SEXO FECHA_RESULTADO UBIGEO
0 TUMBES TUMBES TUMBES 46.0 FEMENINO 20221207.0 240101.0
1 LIMA LIMA JESUS MARIA 69.0 FEMENINO 20230822.0 150113.0
2 SAN MARTIN MOYOBAMBA MOYOBAMBA 55.0 FEMENINO 20240108.0 220101.0
3 AREQUIPA CAYLLOMA COPORAQUE 50.0 MASCULINO 20230824.0 40506.0
4 LIMA LIMA JESUS MARIA 58.0 MASCULINO 20221217.0 150113.0
In [302]:
# Keep only the year from FECHA_RESULTADO (format YYYYMMDD); missing values become the text 'nan'
covid19['FECHA_RESULTADO'] = covid19['FECHA_RESULTADO'].astype(str).str[:4]

# Drop rows with NaN in EDAD, then convert EDAD to integers to remove the ".0"
covid19 = covid19.dropna(subset=['EDAD'])
covid19['EDAD'] = covid19['EDAD'].astype(int)

#check
covid19.head()
Out[302]:
DEPARTAMENTO PROVINCIA DISTRITO EDAD SEXO FECHA_RESULTADO UBIGEO
0 TUMBES TUMBES TUMBES 46 FEMENINO 2022 240101.0
1 LIMA LIMA JESUS MARIA 69 FEMENINO 2023 150113.0
2 SAN MARTIN MOYOBAMBA MOYOBAMBA 55 FEMENINO 2024 220101.0
3 AREQUIPA CAYLLOMA COPORAQUE 50 MASCULINO 2023 40506.0
4 LIMA LIMA JESUS MARIA 58 MASCULINO 2022 150113.0
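
The year can also be pulled out numerically, which keeps missing values as missing instead of turning them into the text 'nan'. The following is only a sketch on toy values that mimic FECHA_RESULTADO; the 2020-2024 window is an assumption taken from the years seen in this data set.

```python
import pandas as pd

# Toy values mimicking FECHA_RESULTADO: floats in YYYYMMDD form, one missing, one implausible
raw = pd.Series([20221207.0, 20230822.0, None, 18991230.0])

# Integer division by 10000 keeps the leading four digits (the year);
# missing values stay missing instead of becoming the text 'nan'
years = (raw // 10000).astype("Int64")

# Drop missing years, then keep only the plausible range (an assumed 2020-2024 window)
years = years.dropna()
valid_years = years[years.between(2020, 2024)]

print(valid_years.tolist())
```

With this approach the missing dates and the implausible 1899 records are removed in one filtering step.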
In [303]:
# years in data
covid19.FECHA_RESULTADO.value_counts()
Out[303]:
FECHA_RESULTADO
2022    2132009
2021    1307581
2020    1022565
2023      93361
2024      27074
nan        2023
1899        394
Name: count, dtype: int64
In [304]:
# FECHA_RESULTADO is already a string: first drop the missing values (stored as the text 'nan')
covid19 = covid19[~covid19['FECHA_RESULTADO'].isin(['nan'])]

# Then drop the implausible year '1899'
covid19 = covid19[~covid19['FECHA_RESULTADO'].isin(['1899'])]
In [305]:
# Check that only the expected years remain
covid19.FECHA_RESULTADO.value_counts()
Out[305]:
FECHA_RESULTADO
2022    2132009
2021    1307581
2020    1022565
2023      93361
2024      27074
Name: count, dtype: int64
In [306]:
# Show the minimum and maximum of the 'EDAD' column to check that everything is OK
edad_min = covid19['EDAD'].min()
edad_max = covid19['EDAD'].max()

edad_min, edad_max
Out[306]:
(0, 125)
In [307]:
# Create a new column 'Grupo_Edad' with the age categories.
# Note: pd.cut intervals are right-closed by default, so age 0 falls outside the first bin
# and gets NaN (include_lowest=True would include it); the two vulnerable groups are unaffected.
covid19['Grupo_Edad'] = pd.cut(
    covid19['EDAD'],
    bins=[0, 17, 39, 59, float('inf')],
    labels=["Niños y adolescentes (0-17 años)", "Adultos jóvenes (18-39 años)", "Adultos de mediana edad (40-59 años)", "Personas mayores (60+ años)"]
)
covid19.head()
Out[307]:
DEPARTAMENTO PROVINCIA DISTRITO EDAD SEXO FECHA_RESULTADO UBIGEO Grupo_Edad
0 TUMBES TUMBES TUMBES 46 FEMENINO 2022 240101.0 Adultos de mediana edad (40-59 años)
1 LIMA LIMA JESUS MARIA 69 FEMENINO 2023 150113.0 Personas mayores (60+ años)
2 SAN MARTIN MOYOBAMBA MOYOBAMBA 55 FEMENINO 2024 220101.0 Adultos de mediana edad (40-59 años)
3 AREQUIPA CAYLLOMA COPORAQUE 50 MASCULINO 2023 40506.0 Adultos de mediana edad (40-59 años)
4 LIMA LIMA JESUS MARIA 58 MASCULINO 2022 150113.0 Adultos de mediana edad (40-59 años)
In [308]:
covid19.Grupo_Edad.value_counts()
Out[308]:
Grupo_Edad
Adultos jóvenes (18-39 años)            2044590
Adultos de mediana edad (40-59 años)    1485974
Personas mayores (60+ años)              726577
Niños y adolescentes (0-17 años)         308273
Name: count, dtype: int64
In [309]:
# Filter the DataFrame to exclude the non-vulnerable age groups
covid19_vulnerables = covid19[~covid19['Grupo_Edad'].isin(["Niños y adolescentes (0-17 años)", "Adultos jóvenes (18-39 años)"])]
covid19_vulnerables.Grupo_Edad.value_counts()
Out[309]:
Grupo_Edad
Adultos de mediana edad (40-59 años)    1485974
Personas mayores (60+ años)              726577
Niños y adolescentes (0-17 años)              0
Adultos jóvenes (18-39 años)                  0
Name: count, dtype: int64
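
The four group counts add up to fewer rows than the cleaned table, which is consistent with how pd.cut treats the lowest bin edge. Here is a small sketch of that behavior on toy ages; the short labels are placeholders, not the labels used above.

```python
import pandas as pd

ages = pd.Series([0, 17, 18, 39, 40, 59, 60, 125])
bins = [0, 17, 39, 59, float("inf")]
labels = ["0-17", "18-39", "40-59", "60+"]  # placeholder labels for illustration

# Default: intervals are (0, 17], (17, 39], ... so age 0 belongs to no interval
default_cut = pd.cut(ages, bins=bins, labels=labels)

# include_lowest=True closes the first interval on the left, so age 0 is kept
inclusive_cut = pd.cut(ages, bins=bins, labels=labels, include_lowest=True)

print(default_cut.isna().sum())    # ages left without a group
print(inclusive_cut.isna().sum())
```

Since infants are not part of the two vulnerable groups, the choice does not change the analysis that follows, but it matters if the youngest group is ever used.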

Reshaping to Long

We keep only the two most vulnerable groups and count the people in each group, by province and year:

In [311]:
indexList=['FECHA_RESULTADO','DEPARTAMENTO','PROVINCIA','Grupo_Edad']
aggregator={'Grupo_Edad':[len]}
covid19_vulnerables=covid19_vulnerables.groupby(indexList,observed=True).agg(aggregator)
covid19_vulnerables
Out[311]:
Grupo_Edad
len
FECHA_RESULTADO DEPARTAMENTO PROVINCIA Grupo_Edad
2020 AMAZONAS BAGUA Adultos de mediana edad (40-59 años) 2580
Personas mayores (60+ años) 1521
BONGARA Adultos de mediana edad (40-59 años) 129
Personas mayores (60+ años) 69
CHACHAPOYAS Adultos de mediana edad (40-59 años) 696
... ... ... ... ...
2024 TUMBES ZARUMILLA Adultos de mediana edad (40-59 años) 5
Personas mayores (60+ años) 4
UCAYALI CORONEL PORTILLO Adultos de mediana edad (40-59 años) 38
Personas mayores (60+ años) 19
PADRE ABAD Adultos de mediana edad (40-59 años) 2

2039 rows × 1 columns

Sending the counts to wide columns:

In [313]:
Covid19Draft=covid19_vulnerables.unstack(3).fillna(0) # level 3 (Grupo_Edad, the innermost row index) becomes columns
Covid19Draft
Out[313]:
Grupo_Edad
len
Grupo_Edad Adultos de mediana edad (40-59 años) Personas mayores (60+ años)
FECHA_RESULTADO DEPARTAMENTO PROVINCIA
2020 AMAZONAS BAGUA 2580.0 1521.0
BONGARA 129.0 69.0
CHACHAPOYAS 696.0 262.0
CONDORCANQUI 922.0 288.0
EN INVESTIGACIÓN 17.0 18.0
... ... ... ... ...
2024 TUMBES CONTRALMIRANTE VILLAR 0.0 4.0
TUMBES 17.0 15.0
ZARUMILLA 5.0 4.0
UCAYALI CORONEL PORTILLO 38.0 19.0
PADRE ABAD 2.0 0.0

1050 rows × 2 columns
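
The count-then-unstack pattern is easier to follow on a tiny table. This sketch uses made-up records and short placeholder column names; it uses size() instead of agg with len, which gives a plain Series without the extra column level.

```python
import pandas as pd

# Toy records: one row per positive case (values are invented for illustration)
records = pd.DataFrame({
    "year": ["2020", "2020", "2020", "2021", "2021"],
    "province": ["A", "A", "B", "A", "A"],
    "group": ["40-59", "60+", "40-59", "60+", "60+"],
})

# Long format: one count per (year, province, group) combination
counts = records.groupby(["year", "province", "group"]).size()

# Wide format: the 'group' index level becomes columns; absent combinations get 0
wide = counts.unstack("group", fill_value=0)

print(wide)
```

Passing fill_value=0 to unstack keeps the counts as integers, whereas calling fillna(0) afterwards leaves them as floats, as in the output above.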

In [314]:
# ALARMA_pct: share of people aged 60+ among the vulnerable (40+) cases
Covid19Draft['ALARMA_pct']=Covid19Draft.iloc[:,1]/(Covid19Draft.iloc[:,0] + Covid19Draft.iloc[:,1])
covid19_vulnerables_Alarm_w=Covid19Draft['ALARMA_pct'].unstack('FECHA_RESULTADO').fillna(0)
covid19_vulnerables_Alarm_w
Out[314]:
FECHA_RESULTADO 2020 2021 2022 2023 2024
DEPARTAMENTO PROVINCIA
AMAZONAS BAGUA 0.370885 0.391144 0.339266 0.533333 0.458333
BONGARA 0.348485 0.363825 0.305233 0.500000 0.600000
CHACHAPOYAS 0.273486 0.321394 0.268201 0.417476 0.440860
CONDORCANQUI 0.238017 0.339367 0.205714 0.000000 0.000000
EN INVESTIGACIÓN 0.514286 0.392857 0.458333 0.333333 0.000000
... ... ... ... ... ... ...
UCAYALI ATALAYA 0.325243 0.241379 0.344828 0.000000 0.000000
CORONEL PORTILLO 0.387321 0.342441 0.328023 0.404255 0.333333
EN INVESTIGACIÓN 0.335516 0.375000 0.255208 0.500000 0.000000
PADRE ABAD 0.309686 0.332174 0.279487 0.071429 0.000000
PURUS 0.224599 0.300000 0.172414 0.000000 0.000000

221 rows × 5 columns
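
A value of 0 in this table can mean two different things: cases were recorded but none were aged 60+, or no vulnerable cases were recorded at all (filled in by fillna). The sketch below shows the difference on three toy rows; the first row reuses the BAGUA 2020 counts from above, the other two are invented.

```python
import pandas as pd

wide = pd.DataFrame(
    {"40-59": [2580, 0, 2], "60+": [1521, 0, 0]},
    index=["BAGUA", "NO CASES", "ONLY 40-59"],
)

total = wide["40-59"] + wide["60+"]

# 0/0 yields NaN rather than raising, so provinces without cases are marked as missing
share = wide["60+"] / total

print(share)
```

Keeping the NaN, rather than filling it with 0, would let a later map distinguish "no data" from "no older adults among the cases".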

Notice the data type of the column names:

In [316]:
covid19_vulnerables_Alarm_w.columns
Out[316]:
Index(['2020', '2021', '2022', '2023', '2024'], dtype='object', name='FECHA_RESULTADO')

The names are already strings, but they start with digits; adding a text prefix makes them safer to use as column names:

In [318]:
covid19_vulnerables_Alarm_w.columns=['year'+str(x) for x in covid19_vulnerables_Alarm_w.columns]
In [319]:
# then
covid19_vulnerables_Alarm_w
Out[319]:
year2020 year2021 year2022 year2023 year2024
DEPARTAMENTO PROVINCIA
AMAZONAS BAGUA 0.370885 0.391144 0.339266 0.533333 0.458333
BONGARA 0.348485 0.363825 0.305233 0.500000 0.600000
CHACHAPOYAS 0.273486 0.321394 0.268201 0.417476 0.440860
CONDORCANQUI 0.238017 0.339367 0.205714 0.000000 0.000000
EN INVESTIGACIÓN 0.514286 0.392857 0.458333 0.333333 0.000000
... ... ... ... ... ... ...
UCAYALI ATALAYA 0.325243 0.241379 0.344828 0.000000 0.000000
CORONEL PORTILLO 0.387321 0.342441 0.328023 0.404255 0.333333
EN INVESTIGACIÓN 0.335516 0.375000 0.255208 0.500000 0.000000
PADRE ABAD 0.309686 0.332174 0.279487 0.071429 0.000000
PURUS 0.224599 0.300000 0.172414 0.000000 0.000000

221 rows × 5 columns

In [320]:
# as usual, move the index levels back into regular columns
covid19_vulnerables_Alarm_w.reset_index(inplace=True)
covid19_vulnerables_Alarm_w
Out[320]:
DEPARTAMENTO PROVINCIA year2020 year2021 year2022 year2023 year2024
0 AMAZONAS BAGUA 0.370885 0.391144 0.339266 0.533333 0.458333
1 AMAZONAS BONGARA 0.348485 0.363825 0.305233 0.500000 0.600000
2 AMAZONAS CHACHAPOYAS 0.273486 0.321394 0.268201 0.417476 0.440860
3 AMAZONAS CONDORCANQUI 0.238017 0.339367 0.205714 0.000000 0.000000
4 AMAZONAS EN INVESTIGACIÓN 0.514286 0.392857 0.458333 0.333333 0.000000
... ... ... ... ... ... ... ...
216 UCAYALI ATALAYA 0.325243 0.241379 0.344828 0.000000 0.000000
217 UCAYALI CORONEL PORTILLO 0.387321 0.342441 0.328023 0.404255 0.333333
218 UCAYALI EN INVESTIGACIÓN 0.335516 0.375000 0.255208 0.500000 0.000000
219 UCAYALI PADRE ABAD 0.309686 0.332174 0.279487 0.071429 0.000000
220 UCAYALI PURUS 0.224599 0.300000 0.172414 0.000000 0.000000

221 rows × 7 columns

In [321]:
!pip install geopandas

Let's load a map:

In [323]:
mapLink='https://github.com/SocialAnalytics-StrategicIntelligence/GeoDF_Analytics/raw/main/maps/ProvsINEI2023.zip'

import geopandas as gpd

provmap=gpd.read_file(mapLink)

provmap.info()
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 196 entries, 0 to 195
Data columns (total 6 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   OBJECTID    196 non-null    float64 
 1   CCDD        196 non-null    object  
 2   CCPP        196 non-null    object  
 3   DEPARTAMEN  196 non-null    object  
 4   PROVINCIA   196 non-null    object  
 5   geometry    196 non-null    geometry
dtypes: float64(1), geometry(1), object(4)
memory usage: 9.3+ KB

Let me create a key column by concatenating the department and province names:

In [325]:
provmap['location']=provmap['DEPARTAMEN']+'+'+provmap['PROVINCIA']
provmap.head(10)
Out[325]:
OBJECTID CCDD CCPP DEPARTAMEN PROVINCIA geometry location
0 1.0 01 01 AMAZONAS CHACHAPOYAS POLYGON ((-77.72614 -5.94354, -77.72486 -5.943... AMAZONAS+CHACHAPOYAS
1 2.0 01 02 AMAZONAS BAGUA POLYGON ((-78.61909 -4.51001, -78.61802 -4.510... AMAZONAS+BAGUA
2 3.0 01 03 AMAZONAS BONGARA POLYGON ((-77.72759 -5.14030, -77.72361 -5.140... AMAZONAS+BONGARA
3 4.0 01 04 AMAZONAS CONDORCANQUI POLYGON ((-77.81399 -2.99278, -77.81483 -2.995... AMAZONAS+CONDORCANQUI
4 5.0 01 05 AMAZONAS LUYA POLYGON ((-78.13023 -5.90370, -78.13011 -5.904... AMAZONAS+LUYA
5 6.0 01 06 AMAZONAS RODRIGUEZ DE MENDOZA POLYGON ((-77.44452 -6.05002, -77.44387 -6.050... AMAZONAS+RODRIGUEZ DE MENDOZA
6 7.0 01 07 AMAZONAS UTCUBAMBA POLYGON ((-78.09288 -5.36258, -78.09288 -5.364... AMAZONAS+UTCUBAMBA
7 8.0 02 01 ANCASH HUARAZ POLYGON ((-77.39870 -9.35563, -77.39852 -9.356... ANCASH+HUARAZ
8 9.0 02 02 ANCASH AIJA POLYGON ((-77.61368 -9.64900, -77.61241 -9.649... ANCASH+AIJA
9 10.0 02 03 ANCASH ANTONIO RAYMONDI POLYGON ((-77.08856 -8.97496, -77.08804 -8.975... ANCASH+ANTONIO RAYMONDI

I will do the same with the data frame:

In [327]:
covid19_vulnerables_Alarm_w['location']=covid19_vulnerables_Alarm_w['DEPARTAMENTO']+'+'+covid19_vulnerables_Alarm_w['PROVINCIA']
covid19_vulnerables_Alarm_w.head()
Out[327]:
DEPARTAMENTO PROVINCIA year2020 year2021 year2022 year2023 year2024 location
0 AMAZONAS BAGUA 0.370885 0.391144 0.339266 0.533333 0.458333 AMAZONAS+BAGUA
1 AMAZONAS BONGARA 0.348485 0.363825 0.305233 0.500000 0.600000 AMAZONAS+BONGARA
2 AMAZONAS CHACHAPOYAS 0.273486 0.321394 0.268201 0.417476 0.440860 AMAZONAS+CHACHAPOYAS
3 AMAZONAS CONDORCANQUI 0.238017 0.339367 0.205714 0.000000 0.000000 AMAZONAS+CONDORCANQUI
4 AMAZONAS EN INVESTIGACIÓN 0.514286 0.392857 0.458333 0.333333 0.000000 AMAZONAS+EN INVESTIGACIÓN

Preprocessing

Names from non-English-speaking countries may include accents and other diacritics (´, ~) that can cause trouble when matching. Let's get rid of those:

In [330]:
!pip install unidecode
In [331]:
import unidecode


byePunctuation=lambda x: unidecode.unidecode(x)
covid19_vulnerables_Alarm_w['location']=covid19_vulnerables_Alarm_w['location'].apply(byePunctuation)
provmap['location']=provmap['location'].apply(byePunctuation)
In [332]:
# removing dashes, underscores and whitespace (a raw string avoids the invalid escape sequence warnings)
covid19_vulnerables_Alarm_w['location']=covid19_vulnerables_Alarm_w.location.str.replace(r"[-_\s]+","",regex=True)
provmap['location']=provmap.location.str.replace(r"[-_\s]+","",regex=True)
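
Both cleaning steps can be wrapped in one function so that exactly the same rule is applied to both tables. This is only a sketch: normalize_key is a hypothetical helper, and it uses the standard-library unicodedata module as a stand-in for unidecode, which is enough for accented Latin letters but does not transliterate other scripts the way unidecode does.

```python
import re
import unicodedata

def normalize_key(text):
    """Strip diacritics, then remove dashes, underscores and whitespace."""
    # NFKD splits an accented letter into the base letter plus a combining mark
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[-_\s]+", "", no_accents)

print(normalize_key("AMAZONAS+EN INVESTIGACIÓN"))
print(normalize_key("MADRE DE DIOS+MANU"))
```

With pandas the function would be applied with something like df['location'].apply(normalize_key) on each table.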

Merging

We need to merge both tables now. That works well only if both tables have a key column: a column (or combination of columns) whose values in one table are the same as in the other.

The two key columns need not contain exactly the same set of values, but only the values common to both are matched.

Let's find out what is NOT matched in each table:

In [334]:
nomatch_df=set(covid19_vulnerables_Alarm_w.location)- set(provmap.location)
nomatch_gdf=set(provmap.location)-set(covid19_vulnerables_Alarm_w.location)

This is how many values could not be matched in each table:

In [336]:
len(nomatch_df), len(nomatch_gdf)
Out[336]:
(27, 2)
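
The same check can be done with pandas itself: an outer merge with indicator=True labels every key as present in the left table only, the right table only, or both. A minimal sketch on toy keys (the two small tables are invented for illustration):

```python
import pandas as pd

map_keys = pd.DataFrame({"location": ["ICA+NASCA", "ICA+ICA", "LIMA+LIMA"]})
data_keys = pd.DataFrame({"location": ["ICA+NAZCA", "ICA+ICA", "LIMA+LIMA"]})

# The '_merge' column tells where each key was found
check = map_keys.merge(data_keys, on="location", how="outer", indicator=True)

only_in_map = check.loc[check["_merge"] == "left_only", "location"].tolist()
only_in_data = check.loc[check["_merge"] == "right_only", "location"].tolist()

print(only_in_map, only_in_data)
```

The set differences used above give the same information; the merge version is handy when the unmatched rows need to be inspected together with their other columns.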

The right way to go is fuzzy matching (this requires the thefuzz package):

In [338]:
!pip install thefuzz
In [339]:
# pick the closest match from nomatch_gdf for a value in nomatch_df
from thefuzz import process
[(dis,process.extractOne(dis,nomatch_gdf)) for dis in sorted(nomatch_df)]
Out[339]:
[('AMAZONAS+ENINVESTIGACION', ('ANCASH+ANTONIORAYMONDI', 48)),
 ('ANCASH+ANTONIORAIMONDI', ('ANCASH+ANTONIORAYMONDI', 95)),
 ('ANCASH+ENINVESTIGACION', ('ANCASH+ANTONIORAYMONDI', 59)),
 ('APURIMAC+ENINVESTIGACION', ('ICA+NASCA', 40)),
 ('AREQUIPA+ENINVESTIGACION', ('ICA+NASCA', 40)),
 ('AYACUCHO+ENINVESTIGACION', ('ANCASH+ANTONIORAYMONDI', 43)),
 ('CAJAMARCA+ENINVESTIGACION', ('ICA+NASCA', 50)),
 ('CALLAO+ENINVESTIGACION', ('ANCASH+ANTONIORAYMONDI', 41)),
 ('CUSCO+ENINVESTIGACION', ('ANCASH+ANTONIORAYMONDI', 42)),
 ('HUANCAVELICA+ENINVESTIGACION', ('ICA+NASCA', 50)),
 ('HUANUCO+ENINVESTIGACION', ('ANCASH+ANTONIORAYMONDI', 44)),
 ('ICA+ENINVESTIGACION', ('ICA+NASCA', 86)),
 ('ICA+NAZCA', ('ICA+NASCA', 89)),
 ('JUNIN+ENINVESTIGACION', ('ICA+NASCA', 40)),
 ('LALIBERTAD+ENINVESTIGACION', ('ICA+NASCA', 40)),
 ('LAMBAYEQUE+ENINVESTIGACION', ('ICA+NASCA', 40)),
 ('LIMA+ENINVESTIGACION', ('ICA+NASCA', 45)),
 ('LORETO+ENINVESTIGACION', ('ICA+NASCA', 40)),
 ('MADREDEDIOS+ENINVESTIGACION', ('ICA+NASCA', 40)),
 ('MOQUEGUA+ENINVESTIGACION', ('ICA+NASCA', 40)),
 ('PASCO+ENINVESTIGACION', ('ICA+NASCA', 48)),
 ('PIURA+ENINVESTIGACION', ('ICA+NASCA', 42)),
 ('PUNO+ENINVESTIGACION', ('ICA+NASCA', 40)),
 ('SANMARTIN+ENINVESTIGACION', ('ANCASH+ANTONIORAYMONDI', 43)),
 ('TACNA+ENINVESTIGACION', ('ICA+NASCA', 48)),
 ('TUMBES+ENINVESTIGACION', ('ICA+NASCA', 40)),
 ('UCAYALI+ENINVESTIGACION', ('ANCASH+ANTONIORAYMONDI', 40))]

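Only the high-scoring pairs (ANTONIORAIMONDI/ANTONIORAYMONDI and NAZCA/NASCA) look like genuine spelling variants; the ENINVESTIGACION ("under investigation") entries are not provinces, and their mostly low scores reflect that. If you are comfortable with the matches, prepare a dictionary of changes:

In [341]:
# is this OK?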
{dis:process.extractOne(dis,nomatch_gdf)[0] for dis in sorted(nomatch_df)}
Out[341]:
{'AMAZONAS+ENINVESTIGACION': 'ANCASH+ANTONIORAYMONDI',
 'ANCASH+ANTONIORAIMONDI': 'ANCASH+ANTONIORAYMONDI',
 'ANCASH+ENINVESTIGACION': 'ANCASH+ANTONIORAYMONDI',
 'APURIMAC+ENINVESTIGACION': 'ICA+NASCA',
 'AREQUIPA+ENINVESTIGACION': 'ICA+NASCA',
 'AYACUCHO+ENINVESTIGACION': 'ANCASH+ANTONIORAYMONDI',
 'CAJAMARCA+ENINVESTIGACION': 'ICA+NASCA',
 'CALLAO+ENINVESTIGACION': 'ANCASH+ANTONIORAYMONDI',
 'CUSCO+ENINVESTIGACION': 'ANCASH+ANTONIORAYMONDI',
 'HUANCAVELICA+ENINVESTIGACION': 'ICA+NASCA',
 'HUANUCO+ENINVESTIGACION': 'ANCASH+ANTONIORAYMONDI',
 'ICA+ENINVESTIGACION': 'ICA+NASCA',
 'ICA+NAZCA': 'ICA+NASCA',
 'JUNIN+ENINVESTIGACION': 'ICA+NASCA',
 'LALIBERTAD+ENINVESTIGACION': 'ICA+NASCA',
 'LAMBAYEQUE+ENINVESTIGACION': 'ICA+NASCA',
 'LIMA+ENINVESTIGACION': 'ICA+NASCA',
 'LORETO+ENINVESTIGACION': 'ICA+NASCA',
 'MADREDEDIOS+ENINVESTIGACION': 'ICA+NASCA',
 'MOQUEGUA+ENINVESTIGACION': 'ICA+NASCA',
 'PASCO+ENINVESTIGACION': 'ICA+NASCA',
 'PIURA+ENINVESTIGACION': 'ICA+NASCA',
 'PUNO+ENINVESTIGACION': 'ICA+NASCA',
 'SANMARTIN+ENINVESTIGACION': 'ANCASH+ANTONIORAYMONDI',
 'TACNA+ENINVESTIGACION': 'ICA+NASCA',
 'TUMBES+ENINVESTIGACION': 'ICA+NASCA',
 'UCAYALI+ENINVESTIGACION': 'ANCASH+ANTONIORAYMONDI'}
In [342]:
# then:
changesinDF={dis:process.extractOne(dis,nomatch_gdf)[0] for dis in sorted(nomatch_df)}
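
One way to answer the "is this OK?" question is to accept a replacement only when the similarity is high enough. The sketch below uses difflib from the standard library as a stand-in for thefuzz, so the scores are on a 0-1 scale rather than 0-100 and will not equal the numbers shown above; the 0.85 cutoff and the shortened toy lists are assumptions for illustration.

```python
from difflib import get_close_matches

# A few of the unmatched names from each table (subset chosen for illustration)
unmatched_in_data = [
    "ANCASH+ANTONIORAIMONDI",
    "ICA+ENINVESTIGACION",
    "ICA+NAZCA",
    "LIMA+ENINVESTIGACION",
]
unmatched_in_map = ["ANCASH+ANTONIORAYMONDI", "ICA+NASCA"]

changes = {}
for name in unmatched_in_data:
    # Keep the best candidate only if its similarity ratio reaches the cutoff
    best = get_close_matches(name, unmatched_in_map, n=1, cutoff=0.85)
    if best:
        changes[name] = best[0]

print(changes)
```

Names left out of the dictionary keep their original value, so after a left merge from the map they simply would not attach to any polygon.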

Now, make the replacements:

In [344]:
covid19_vulnerables_Alarm_w.replace({'location': changesinDF}, inplace=True)

Is it over?

In [346]:
nomatch_df=set(covid19_vulnerables_Alarm_w.location)- set(provmap.location)
nomatch_gdf=set(provmap.location)-set(covid19_vulnerables_Alarm_w.location)

[(dis,process.extractOne(dis,nomatch_gdf)) for dis in sorted(nomatch_df)]
Out[346]:
[]

Now the merge can happen:

In [348]:
covid19_vulnerables_Alarm_map=provmap.merge(covid19_vulnerables_Alarm_w, on='location',how='left',indicator='flag')
In [349]:
# check
covid19_vulnerables_Alarm_map.info()
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 221 entries, 0 to 220
Data columns (total 15 columns):
 #   Column        Non-Null Count  Dtype   
---  ------        --------------  -----   
 0   OBJECTID      221 non-null    float64 
 1   CCDD          221 non-null    object  
 2   CCPP          221 non-null    object  
 3   DEPARTAMEN    221 non-null    object  
 4   PROVINCIA_x   221 non-null    object  
 5   geometry      221 non-null    geometry
 6   location      221 non-null    object  
 7   DEPARTAMENTO  221 non-null    object  
 8   PROVINCIA_y   221 non-null    object  
 9   year2020      221 non-null    float64 
 10  year2021      221 non-null    float64 
 11  year2022      221 non-null    float64 
 12  year2023      221 non-null    float64 
 13  year2024      221 non-null    float64 
 14  flag          221 non-null    category
dtypes: category(1), float64(6), geometry(1), object(7)
memory usage: 24.6+ KB
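
The merged table has 221 rows although the map has 196 provinces, which suggests that some keys appear more than once in the data table after the replacements. A sketch of how to detect that before merging, on invented toy tables (geometry_id is a placeholder for the real geometry column):

```python
import pandas as pd

shapes = pd.DataFrame({"location": ["ICA+NASCA", "ICA+ICA"], "geometry_id": [1, 2]})
values = pd.DataFrame({
    "location": ["ICA+NASCA", "ICA+NASCA", "ICA+ICA"],
    "year2022": [0.31, 0.45, 0.28],
})

# Keys that occur more than once in the data table
dup_keys = values.loc[values["location"].duplicated(keep=False), "location"].unique().tolist()

# A left merge repeats the map row once per duplicated key
merged = shapes.merge(values, on="location", how="left")

# validate='one_to_one' makes pandas raise instead of silently duplicating rows
try:
    shapes.merge(values, on="location", how="left", validate="one_to_one")
    is_one_to_one = True
except pd.errors.MergeError:
    is_one_to_one = False

print(dup_keys, len(merged), is_one_to_one)
```

Duplicated polygons matter later on, because each repeated row is treated as a separate spatial unit when neighbors are computed.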
In [350]:
# avoid problems with fillna()
covid19_vulnerables_Alarm_map['flag']=covid19_vulnerables_Alarm_map.flag.astype(str)

We can get rid of some columns:

In [352]:
covid19_vulnerables_Alarm_map.info()
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 221 entries, 0 to 220
Data columns (total 15 columns):
 #   Column        Non-Null Count  Dtype   
---  ------        --------------  -----   
 0   OBJECTID      221 non-null    float64 
 1   CCDD          221 non-null    object  
 2   CCPP          221 non-null    object  
 3   DEPARTAMEN    221 non-null    object  
 4   PROVINCIA_x   221 non-null    object  
 5   geometry      221 non-null    geometry
 6   location      221 non-null    object  
 7   DEPARTAMENTO  221 non-null    object  
 8   PROVINCIA_y   221 non-null    object  
 9   year2020      221 non-null    float64 
 10  year2021      221 non-null    float64 
 11  year2022      221 non-null    float64 
 12  year2023      221 non-null    float64 
 13  year2024      221 non-null    float64 
 14  flag          221 non-null    object  
dtypes: float64(6), geometry(1), object(8)
memory usage: 26.0+ KB
In [353]:
bye=['DEPARTAMENTO', 'CCPP','CCDD']
covid19_vulnerables_Alarm_map.drop(columns=bye,inplace=True)

# keeping
covid19_vulnerables_Alarm_map.head()
Out[353]:
OBJECTID DEPARTAMEN PROVINCIA_x geometry location PROVINCIA_y year2020 year2021 year2022 year2023 year2024 flag
0 1.0 AMAZONAS CHACHAPOYAS POLYGON ((-77.72614 -5.94354, -77.72486 -5.943... AMAZONAS+CHACHAPOYAS CHACHAPOYAS 0.273486 0.321394 0.268201 0.417476 0.440860 both
1 2.0 AMAZONAS BAGUA POLYGON ((-78.61909 -4.51001, -78.61802 -4.510... AMAZONAS+BAGUA BAGUA 0.370885 0.391144 0.339266 0.533333 0.458333 both
2 3.0 AMAZONAS BONGARA POLYGON ((-77.72759 -5.14030, -77.72361 -5.140... AMAZONAS+BONGARA BONGARA 0.348485 0.363825 0.305233 0.500000 0.600000 both
3 4.0 AMAZONAS CONDORCANQUI POLYGON ((-77.81399 -2.99278, -77.81483 -2.995... AMAZONAS+CONDORCANQUI CONDORCANQUI 0.238017 0.339367 0.205714 0.000000 0.000000 both
4 5.0 AMAZONAS LUYA POLYGON ((-78.13023 -5.90370, -78.13011 -5.904... AMAZONAS+LUYA LUYA 0.383117 0.368317 0.309783 0.346154 0.400000 both
In [354]:
# filling with zeroes
covid19_vulnerables_Alarm_map.fillna(0,inplace=True)

We can save this GeoDataFrame:

In [356]:
import os
covid19_vulnerables_Alarm_map.to_file(
    os.path.join('C:\\Users\\Luis\\Documents\\GitHub\\covid_19', "provinciasPeru.gpkg"),
    layer='provinciasCovid19',
    driver="GPKG"
)

Exploring one variable

Now we explore one variable in the map statistically:

In [359]:
# statistics
covid19_vulnerables_Alarm_map.year2022.describe()
Out[359]:
count    221.000000
mean       0.324225
std        0.067366
min        0.000000
25%        0.289458
50%        0.321721
75%        0.360688
max        0.600000
Name: year2022, dtype: float64

A visual look:

In [361]:
import seaborn as sea

sea.boxplot(covid19_vulnerables_Alarm_map.year2022, color='yellow',orient='h')
Out[361]:
<Axes: xlabel='year2022'>
In [362]:
from sklearn.preprocessing import QuantileTransformer
qt = QuantileTransformer(n_quantiles=100, random_state=0,output_distribution='normal')
qt_result=qt.fit_transform(covid19_vulnerables_Alarm_map[['year2022']])
sea.boxplot(qt_result, color='yellow',orient='h')
Out[362]:
<Axes: >
In [363]:
covid19_vulnerables_Alarm_map['year_2022_qt']=qt_result
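
The idea behind a quantile transform with a normal output can be shown with ranks alone: each value is replaced by the normal quantile of its relative position in the sorted data. This is a simplified illustration and not scikit-learn's algorithm (which estimates and interpolates quantiles); normal_scores is a hypothetical helper, the toy shares are invented, and ties are not handled.

```python
from statistics import NormalDist

def normal_scores(values):
    """Map each value to the standard normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # (rank - 0.5) / n keeps the probability strictly between 0 and 1
        p = (rank - 0.5) / n
        scores[i] = NormalDist().inv_cdf(p)
    return scores

shares = [0.27, 0.60, 0.32, 0.00, 0.36]
scores = normal_scores(shares)

print([round(s, 3) for s in scores])
```

The order of the values is preserved, while extreme values such as 0.00 and 0.60 are pulled toward the tails of a bell shape instead of showing up as outliers.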

Spatial Correlation

Neighborhood

We can define which polygons in a map are neighbors using different criteria:

In [365]:
!pip install libpysal
In [366]:
from libpysal.weights import Queen, Rook, KNN

# rook: neighbors share an edge

w_rook = Rook.from_dataframe(covid19_vulnerables_Alarm_map,use_index=False)
In [367]:
# queen: neighbors share an edge or a vertex
w_queen = Queen.from_dataframe(covid19_vulnerables_Alarm_map,use_index=False)
In [368]:
# k nearest neighbors
w_knn = KNN.from_dataframe(covid19_vulnerables_Alarm_map, k=8)

Let's understand the differences:

In [370]:
# first one
covid19_vulnerables_Alarm_map.head(1)
Out[370]:
OBJECTID DEPARTAMEN PROVINCIA_x geometry location PROVINCIA_y year2020 year2021 year2022 year2023 year2024 flag year_2022_qt
0 1.0 AMAZONAS CHACHAPOYAS POLYGON ((-77.72614 -5.94354, -77.72486 -5.943... AMAZONAS+CHACHAPOYAS CHACHAPOYAS 0.273486 0.321394 0.268201 0.417476 0.44086 both -0.932398
In [371]:
# neighbors of that province (row 0)
w_rook.neighbors[0]
Out[371]:
[2, 63, 4, 5, 139, 205, 207]
In [372]:
# see
base=covid19_vulnerables_Alarm_map[covid19_vulnerables_Alarm_map.PROVINCIA_x=="CHACHAPOYAS"].plot()
covid19_vulnerables_Alarm_map.iloc[w_rook.neighbors[0] ,].plot(ax=base,facecolor="yellow",edgecolor='k')
covid19_vulnerables_Alarm_map.head(1).plot(ax=base,facecolor="red")
Out[372]:
<Axes: >

Let's do the same for the queen and k-nearest-neighbor weights:

In [374]:
w_queen.neighbors[0]
Out[374]:
[2, 63, 4, 5, 139, 205, 207]
In [375]:
base=covid19_vulnerables_Alarm_map[covid19_vulnerables_Alarm_map.PROVINCIA_x=="CHACHAPOYAS"].plot()
covid19_vulnerables_Alarm_map.iloc[w_queen.neighbors[0] ,].plot(ax=base,facecolor="yellow",edgecolor='k')
covid19_vulnerables_Alarm_map.head(1).plot(ax=base,facecolor="red")
Out[375]:
[Figure: map of Chachapoyas (red) and its queen neighbors (yellow)]
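
For province 0 the rook and queen lists are identical, which can happen when no neighbor touches the province at a single vertex only. The minimal sketch below uses a hypothetical 3x3 grid, not the Peru data, to show where the two criteria diverge. The helper grid_neighbors is our own illustration and is not part of libpysal.

```python
# Toy illustration: cells of a regular grid, numbered row by row (0..8 for 3x3).
def grid_neighbors(rows, cols, cell, rule="queen"):
    """Return the sorted neighbor ids of `cell` under rook or queen contiguity."""
    r, c = divmod(cell, cols)
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the cell itself
            if rule == "rook" and dr != 0 and dc != 0:
                continue  # rook ignores cells that only share a corner
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                found.append(rr * cols + cc)
    return sorted(found)

rook_center = grid_neighbors(3, 3, 4, rule="rook")    # shared edges only
queen_center = grid_neighbors(3, 3, 4, rule="queen")  # shared edges or corners
print(rook_center)
print(queen_center)
```

On real polygons such as provinces, corner-only contacts are rare, so rook and queen weights often differ in only a few rows.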
In [376]:
w_knn.neighbors[0]
Out[376]:
[5, 4, 63, 207, 2, 67, 200, 139]
In [377]:
base=covid19_vulnerables_Alarm_map[covid19_vulnerables_Alarm_map.PROVINCIA_x=="CHACHAPOYAS"].plot()
covid19_vulnerables_Alarm_map.iloc[w_knn.neighbors[0] ,].plot(ax=base,facecolor="yellow",edgecolor='k')
covid19_vulnerables_Alarm_map.head(1).plot(ax=base,facecolor="red")
Out[377]:
[Figure: map of Chachapoyas (red) and its 8 nearest neighbors (yellow)]
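
Note that the KNN list for province 0 always has exactly 8 entries. It includes provinces 67 and 200, which are not contiguous to it, and it drops 205, which is. The sketch below illustrates the idea on hypothetical points. It assumes that neighbors are ranked by Euclidean distance between representative points (as far as we know, libpysal uses polygon centroids for this). The names points and knn are ours.

```python
import math

# Hypothetical centroids (not taken from the Peru map).
points = {"A": (0, 0), "B": (1, 0), "C": (0, 2), "D": (5, 5), "E": (1, 1)}

def knn(points, k):
    """For each point, return the names of its k closest points."""
    result = {}
    for name, xy in points.items():
        others = [other for other in points if other != name]
        others.sort(key=lambda other: math.dist(xy, points[other]))
        result[name] = others[:k]
    return result

nearest = knn(points, k=2)
print(nearest["A"])  # the two points closest to A
print(nearest["D"])  # D is far away, yet it still gets two neighbors
```

Unlike contiguity, this relation is not necessarily symmetric: E is among D's nearest neighbors, but D is not among E's.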

Let's look more closely at the queen results:

In [379]:
# all the neighbors by row
w_queen.neighbors
Out[379]:
{0: [2, 63, 4, 5, 139, 205, 207],
 1: [3, 68, 69, 6],
 2: [0, 207, 3, 4, 6, 168, 200],
 3: [168, 1, 2, 6],
 4: [0, 64, 2, 6, 63],
 5: [0, 200, 203, 205, 207],
 6: [64, 1, 2, 3, 68, 4, 66],
 7: [34, 20, 22, 8, 24, 25, 31],
 8: [25, 31, 7],
 9: [98, 10, 11, 12, 13, 14, 15, 16, 17, 21, 24],
 10: [98, 9, 11, 12, 13, 14, 15, 16, 17, 21, 24],
 11: [98, 9, 10, 12, 13, 14, 15, 16, 17, 21, 24],
 12: [98, 9, 10, 11, 13, 14, 15, 16, 17, 21, 24],
 13: [98, 9, 10, 11, 12, 14, 15, 16, 17, 21, 24],
 14: [98, 9, 10, 11, 12, 13, 15, 16, 17, 21, 24],
 15: [98, 9, 10, 11, 12, 13, 14, 16, 17, 21, 24],
 16: [98, 9, 10, 11, 12, 13, 14, 15, 17, 21, 24],
 17: [98, 9, 10, 11, 12, 13, 14, 15, 16, 21, 24],
 18: [24, 34, 20, 21],
 19: [97, 99, 104, 24, 153, 154, 28, 25, 31],
 20: [24, 34, 18, 7],
 21: [98, 34, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 24, 27],
 22: [32, 25, 34, 7],
 23: [32, 33, 26, 29],
 24: [7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 31, 98, 99],
 25: [19, 22, 7, 8, 153, 31],
 26: [32, 33, 34, 23, 30],
 27: [34, 98, 21, 101, 30],
 28: [153, 154, 19, 159],
 29: [32, 33, 144, 146, 148, 23],
 30: [144, 33, 34, 101, 26, 27],
 31: [19, 7, 24, 8, 25],
 32: [34, 146, 148, 22, 23, 26, 29],
 33: [144, 23, 26, 29, 30],
 34: [32, 7, 18, 20, 21, 22, 26, 27, 30],
 35: [83, 36, 37, 38, 39, 41, 77],
 36: [35, 38, 40, 83, 54, 55, 56, 58, 60],
 37: [81, 49, 35, 38, 39, 56, 41],
 38: [56, 35, 36, 37],
 39: [81, 35, 84, 37, 41, 77],
 40: [50, 36, 54, 60],
 41: [35, 37, 39],
 42: [48, 197, 46, 43, 174],
 43: [48, 42, 44, 45, 46, 47],
 44: [55,
  56,
  57,
  43,
  108,
  109,
  110,
  47,
  111,
  49,
  112,
  113,
  114,
  115,
  116,
  119,
  120,
  121,
  122,
  123,
  117,
  125,
  118,
  124],
 45: [43, 46, 47],
 46: [193, 197, 42, 43, 45, 47, 81, 82],
 47: [49, 81, 82, 43, 44, 45, 46],
 48: [42, 43, 173, 174, 175],
 49: [81, 37, 56, 57, 44, 47],
 50: [51, 53, 54, 40, 90, 60, 93],
 51: [50, 59, 60, 93],
 52: [59, 93, 55],
 53: [133, 50, 83, 54, 89, 90, 92, 94],
 54: [50, 83, 36, 53, 40],
 55: [119,
  56,
  93,
  36,
  58,
  59,
  44,
  108,
  109,
  111,
  110,
  113,
  114,
  115,
  52,
  116,
  117,
  112,
  120,
  121,
  122,
  123,
  118,
  125,
  126,
  124],
 56: [49, 36, 37, 38, 55, 57, 44],
 57: [56, 49, 44],
 58: [59, 36, 60, 55],
 59: [51, 52, 55, 58, 60, 93],
 60: [50, 51, 36, 40, 58, 59],
 61: [65, 67, 147, 70, 72, 62, 63],
 62: [145, 147, 70, 139, 61, 142],
 63: [0, 64, 67, 4, 70, 139, 61],
 64: [66, 67, 4, 6, 73, 149, 150, 63],
 65: [147, 71, 72, 138, 140, 61, 143],
 66: [64, 68, 150, 6],
 67: [64, 71, 72, 73, 61, 63],
 68: [1, 66, 69, 6, 181, 150, 151],
 69: [1, 68, 181],
 70: [139, 61, 62, 63],
 71: [65, 67, 149, 72, 73, 140],
 72: [65, 67, 61, 71],
 73: [64, 67, 149, 71],
 74: [152],
 75: [84, 86, 87, 77, 78],
 76: [80, 81, 84, 86, 79],
 77: [35, 83, 84, 39, 87, 75],
 78: [83, 85, 86, 87, 75, 171],
 79: [80, 81, 194, 82, 76],
 80: [194, 86, 76, 189, 79],
 81: [37, 39, 76, 46, 47, 79, 49, 82, 84],
 82: [193, 194, 81, 79, 46, 47],
 83: [35, 36, 133, 171, 77, 78, 53, 54, 87, 218],
 84: [81, 86, 39, 75, 76, 77],
 85: [78, 171, 86],
 86: [75, 76, 171, 78, 80, 84, 85, 189],
 87: [75, 83, 77, 78],
 88: [128, 161, 89, 90, 91, 92, 93, 94],
 89: [88, 90, 92, 53],
 90: [50, 53, 88, 89, 93],
 91: [161, 88, 107, 93, 127],
 92: [88, 89, 53, 94],
 93: [59, 106, 50, 51, 52, 55, 88, 90, 91, 126, 127],
 94: [128, 53, 133, 88, 92],
 95: [96, 97, 100, 102, 104, 105],
 96: [176, 177, 102, 104, 95],
 97: [99, 100, 19, 104, 105, 95],
 98: [99, 100, 101, 9, 10, 11, 12, 13, 14, 15, 16, 17, 21, 24, 27],
 99: [97, 98, 19, 100, 24],
 100: [97, 98, 99, 101, 102, 167, 103, 201, 209, 219, 95],
 101: [144, 209, 98, 100, 27, 30],
 102: [96, 176, 178, 100, 103, 95],
 103: [178, 100, 102, 217, 219],
 104: [96, 97, 160, 105, 177, 19, 154, 95],
 105: [104, 97, 95],
 106: [93,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  118,
  119,
  120,
  121,
  122,
  123,
  124,
  125,
  126,
  127],
 107: [161, 91, 156, 127],
 108: [119,
  106,
  44,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  118,
  55,
  120,
  121,
  122,
  123,
  124,
  125,
  126],
 109: [119,
  106,
  44,
  108,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  118,
  55,
  120,
  121,
  122,
  123,
  124,
  125,
  126],
 110: [119,
  106,
  44,
  108,
  109,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  118,
  55,
  120,
  121,
  122,
  123,
  124,
  125,
  126],
 111: [119,
  106,
  44,
  108,
  109,
  110,
  112,
  113,
  114,
  115,
  116,
  117,
  118,
  55,
  120,
  121,
  122,
  123,
  124,
  125,
  126],
 112: [119,
  106,
  44,
  108,
  109,
  110,
  111,
  113,
  114,
  115,
  116,
  117,
  118,
  55,
  120,
  121,
  122,
  123,
  124,
  125,
  126],
 113: [119,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  114,
  115,
  116,
  117,
  118,
  55,
  120,
  121,
  122,
  123,
  124,
  125,
  126],
 114: [119,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  113,
  115,
  116,
  117,
  118,
  55,
  120,
  121,
  122,
  123,
  124,
  125,
  126],
 115: [119,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  116,
  117,
  118,
  55,
  120,
  121,
  122,
  123,
  124,
  125,
  126],
 116: [119,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  117,
  118,
  55,
  120,
  121,
  122,
  123,
  124,
  125,
  126],
 117: [119,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  118,
  55,
  120,
  121,
  122,
  123,
  124,
  125,
  126],
 118: [119,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  55,
  120,
  121,
  122,
  123,
  124,
  125,
  126],
 119: [120,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  55,
  118,
  121,
  122,
  123,
  124,
  125,
  126],
 120: [119,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  55,
  118,
  121,
  122,
  123,
  124,
  125,
  126],
 121: [119,
  120,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  55,
  118,
  122,
  123,
  124,
  125,
  126],
 122: [119,
  120,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  55,
  118,
  121,
  123,
  124,
  125,
  126],
 123: [119,
  120,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  55,
  118,
  121,
  122,
  124,
  125,
  126],
 124: [119,
  120,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  55,
  118,
  121,
  122,
  123,
  125,
  126],
 125: [119,
  120,
  106,
  44,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  55,
  118,
  121,
  122,
  123,
  124,
  126],
 126: [119,
  93,
  106,
  108,
  109,
  110,
  111,
  112,
  113,
  114,
  115,
  116,
  117,
  118,
  55,
  120,
  121,
  122,
  123,
  124,
  125],
 127: [91, 106, 107, 93],
 128: [161, 129, 133, 136, 88, 94],
 129: [128, 161, 131, 133, 136],
 130: [178, 131, 132, 133, 134],
 131: [129, 130, 161, 133, 134, 135, 158],
 132: [176, 178, 130, 134, 135],
 133: [128, 129, 130, 131, 178, 83, 53, 218, 94],
 134: [130, 131, 132, 135],
 135: [176, 131, 132, 134, 155, 157, 158],
 136: [128, 161, 129],
 137: [138, 148, 141, 142],
 138: [65, 147, 137, 142, 143],
 139: [0, 144, 145, 70, 205, 62, 63],
 140: [143, 65, 149, 71],
 141: [137, 146, 148, 142],
 142: [145, 146, 147, 137, 138, 141, 62],
 143: [65, 138, 140],
 144: [33, 101, 139, 205, 209, 146, 145, 29, 30],
 145: [144, 146, 142, 139, 62],
 146: [32, 144, 145, 29, 148, 141, 142],
 147: [65, 138, 142, 61, 62],
 148: [32, 146, 141, 137, 29],
 149: [64, 150, 71, 151, 73, 140],
 150: [64, 66, 68, 149, 151],
 151: [179, 68, 181, 150, 149, 182, 186],
 152: [74, 155, 156, 157, 158],
 153: [25, 19, 28, 159],
 154: [160, 19, 104, 28, 159],
 155: [152, 157, 158, 135],
 156: [152, 161, 107, 158],
 157: [176, 135, 152, 155, 159],
 158: [161, 131, 135, 152, 155, 156],
 159: [160, 176, 153, 154, 28, 157],
 160: [176, 177, 104, 154, 159],
 161: [128, 129, 131, 136, 107, 88, 91, 156, 158],
 162: [169, 164, 165, 166],
 163: [208, 164, 166, 200, 204, 168],
 164: [168, 162, 163, 166],
 165: [169, 162, 166],
 166: [208, 162, 163, 164, 165, 167, 217],
 167: [208, 100, 217, 166, 201, 219, 206],
 168: [2, 3, 163, 164, 200],
 169: [162, 165],
 170: [198, 218, 171, 172, 189],
 171: [83, 85, 86, 218, 170, 189, 78],
 172: [170, 218, 220],
 173: [48, 191, 211, 212, 187, 174, 175],
 174: [48, 197, 42, 187, 173],
 175: [48, 212, 173],
 176: [96, 160, 132, 102, 135, 177, 178, 157, 159],
 177: [96, 104, 176, 160],
 178: [130, 132, 133, 102, 103, 176, 217, 218],
 179: [180, 182, 151, 184, 183, 186],
 180: [184, 179, 181, 182],
 181: [68, 69, 182, 151, 180],
 182: [179, 180, 181, 151],
 183: [184, 185, 186, 179],
 184: [179, 180, 214, 183, 185, 215],
 185: [184, 215, 183],
 186: [151, 179, 183],
 187: [192, 197, 173, 174, 191],
 188: [192, 193, 194, 196, 197, 189],
 189: [194, 196, 198, 170, 171, 80, 86, 188],
 190: [199, 191],
 191: [211, 213, 187, 173, 190],
 192: [195, 196, 197, 187, 188],
 193: [194, 82, 197, 188, 46],
 194: [80, 193, 82, 188, 189, 79],
 195: [192],
 196: [192, 188, 189, 198],
 197: [192, 193, 188, 42, 187, 174, 46],
 198: [170, 196, 189],
 199: [190],
 200: [2, 163, 5, 168, 202, 203, 204, 207],
 201: [209, 100, 167, 202, 203, 205, 206],
 202: [200, 201, 203, 204, 206],
 203: [5, 200, 201, 202, 205],
 204: [208, 163, 200, 202, 206],
 205: [0, 144, 209, 5, 201, 139, 203],
 206: [208, 167, 201, 202, 204],
 207: [0, 2, 5, 200],
 208: [163, 166, 167, 204, 206],
 209: [144, 100, 101, 201, 205],
 210: [211, 212, 213],
 211: [210, 212, 213, 173, 191],
 212: [210, 211, 213, 173, 175],
 213: [210, 211, 212, 191],
 214: [184, 216, 215],
 215: [184, 185, 214],
 216: [214],
 217: [178, 166, 103, 167, 218, 219],
 218: [133, 170, 171, 172, 178, 83, 217, 220],
 219: [167, 100, 217, 103],
 220: [218, 172]}
In [380]:
# the neighborhood (weights) matrix:

pd.DataFrame(*w_queen.full()).astype(int) # 1 means both are neighbors
Out[380]:
0 1 2 3 4 5 6 7 8 9 ... 211 212 213 214 215 216 217 218 219 220
0 0 0 1 0 1 1 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
1 0 0 0 1 0 0 1 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
2 1 0 0 1 1 0 1 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
3 0 1 1 0 0 0 1 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
4 1 0 1 0 0 0 1 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
216 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 1 0 0 0 0 0 0
217 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 1 1 0
218 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 1 0 0 1
219 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 1 0 0 0
220 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 1 0 0

221 rows × 221 columns

In [381]:
# percentage of nonzero cells in the neighborhood matrix (density)
w_queen.pct_nonzero
Out[381]:
3.3005057226510512
In [382]:
# any province with NO neighbors?
w_queen.islands
Out[382]:
[]
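
To make these two diagnostics concrete, here is a minimal sketch on a hypothetical neighbors dictionary. It assumes that the density is the share of nonzero cells in the n x n matrix, expressed as a percentage, and that an island is a row with no neighbors. Under that reading, 3.30% of 221 x 221 cells is about 1,612 nonzero cells, or roughly 7.3 neighbors per province on average.

```python
# Hypothetical neighbors dictionary (same shape as w_queen.neighbors).
neighbors = {0: [1, 2], 1: [0], 2: [0], 3: []}

n = len(neighbors)
nonzero_cells = sum(len(ids) for ids in neighbors.values())

# Share of nonzero cells in the n x n matrix, as a percentage.
pct_nonzero = 100.0 * nonzero_cells / (n * n)

# Units without any neighbor.
islands = [unit for unit, ids in neighbors.items() if not ids]

print(pct_nonzero, islands)
```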

Moran's correlation

We need the neighborhood matrix (the weights matrix) to compute spatial correlation, that is, to check whether the value of a variable in each province is correlated with the values in its neighbors. Such a correlation suggests a spatial effect.

In [384]:
# needed for spatial correlation
w_queen.transform = 'R'
In [385]:
pd.DataFrame(*w_queen.full()).sum(axis=1) # after row-standardization, each row sums to 1
Out[385]:
0      1.0
1      1.0
2      1.0
3      1.0
4      1.0
      ... 
216    1.0
217    1.0
218    1.0
219    1.0
220    1.0
Length: 221, dtype: float64
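
The 'R' transformation divides each row of the binary matrix by its number of neighbors, which is why every row now sums to 1. A minimal sketch on a hypothetical four-unit neighbors dictionary (the names are ours):

```python
# Hypothetical neighbors dictionary; every unit has at least one neighbor.
neighbors = {0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [0, 2]}
n = len(neighbors)

# Binary contiguity matrix: 1 if j is a neighbor of i, else 0.
binary = [[1.0 if j in neighbors[i] else 0.0 for j in range(n)] for i in range(n)]

# Row-standardization: each weight becomes 1 / (number of neighbors of i).
# An island (empty row) would need special handling to avoid dividing by zero.
row_std = [[w / sum(row) for w in row] for row in binary]

row_sums = [sum(row) for row in row_std]
print(row_std[0])
print(row_sums)
```

With these weights, the spatial lag of a variable is simply the average of the variable over each unit's neighbors.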

Spatial correlation is measured by Moran's I statistic, which the esda package provides:

In [387]:
!pip install esda
Installing collected packages: esda
Successfully installed esda-2.6.0

In [389]:
from esda.moran import Moran

morancovid19 = Moran(covid19_vulnerables_Alarm_map['year_2022_qt'], w_queen)
morancovid19.I,morancovid19.p_sim
Out[389]:
(0.08667603252321662, 0.014)
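
The two numbers are the statistic itself and a pseudo p-value obtained by random permutations. As a sketch of what happens under the hood, the code below computes Moran's I by hand on a hypothetical chain of four units with row-standardized weights, and then approximates a one-sided pseudo p-value by shuffling the values. This is a simplification for illustration and may differ in detail from what esda does.

```python
import random

def moran_i(values, neighbors):
    """Moran's I with row-standardized weights built from a neighbors dict."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    # Spatial lag: average deviation among each unit's neighbors.
    lag = [sum(z[j] for j in neighbors[i]) / len(neighbors[i]) for i in range(n)]
    # With row-standardized weights the weights sum to n, so the n / S0 factor is 1.
    return sum(zi * li for zi, li in zip(z, lag)) / sum(zi * zi for zi in z)

# Hypothetical data: four units in a row, values increasing smoothly.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
values = [1.0, 2.0, 3.0, 4.0]
observed = moran_i(values, neighbors)

# Permutation test: how often does a random rearrangement reach the observed I?
rng = random.Random(0)
permutations = 999
at_least_as_large = 0
for _ in range(permutations):
    shuffled = values[:]
    rng.shuffle(shuffled)
    if moran_i(shuffled, neighbors) >= observed:
        at_least_as_large += 1
p_sim = (at_least_as_large + 1) / (permutations + 1)

print(observed, p_sim)
```

With only four units the pseudo p-value cannot be small. The point here is the mechanics, not the result.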

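Moran's I is positive and statistically significant (I ≈ 0.087, pseudo p-value = 0.014), although the spatial autocorrelation is weak. Let's visualize it:

In [391]: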
!pip install splot
Installing collected packages: quantecon, spreg, giddy, splot
Successfully installed giddy-2.3.5 quantecon-0.7.2 splot-1.1.7 spreg-1.7
In [392]:
from splot.esda import moran_scatterplot
import matplotlib.pyplot as plt

fig, ax = moran_scatterplot(morancovid19)
ax.set_xlabel('Covid19_alarma_share')
ax.set_ylabel('SpatialLag_Covid19_alarma_share')
Out[392]:
[Figure: Moran scatterplot; x-axis: Covid19_alarma_share, y-axis: SpatialLag_Covid19_alarma_share]
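
The scatterplot places each unit's (centered) value on the x-axis and the average of its neighbors on the y-axis. With row-standardized weights, the slope of the fitted line equals Moran's I. The sketch below checks this on a hypothetical chain of five units whose values alternate between high and low, which gives a negative slope.

```python
# Hypothetical data: five units in a row with alternating high and low values.
values = [3.0, 1.0, 4.0, 1.0, 5.0]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

n = len(values)
mean = sum(values) / n
z = [v - mean for v in values]  # x-axis of the scatterplot

# y-axis: spatial lag, the mean deviation among each unit's neighbors
lag = [sum(z[j] for j in neighbors[i]) / len(neighbors[i]) for i in range(n)]

# Ordinary least squares slope of lag on z (with an intercept).
mean_lag = sum(lag) / n
slope = sum(zi * (li - mean_lag) for zi, li in zip(z, lag)) / sum(zi * zi for zi in z)

# Moran's I with row-standardized weights.
moran = sum(zi * li for zi, li in zip(z, lag)) / sum(zi * zi for zi in z)

print(slope, moran)
```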

Local Spatial Correlation

We can compute a LISA (local Moran) statistic for each province. It helps us find spatial clusters (spots) and spatial outliers:

  • A hotSpot is a polygon whose value is high AND is surrounded by polygons that also have high values.

  • A coldSpot is a polygon whose value is low AND is surrounded by polygons that also have low values.

  • A coldOutlier is a polygon whose value is low BUT is surrounded by polygons with high values.

  • A hotOutlier is a polygon whose value is high BUT is surrounded by polygons with low values.
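
The four categories above correspond to the four quadrants of the Moran scatterplot. As a minimal sketch, assuming that "high" and "low" mean above or below the mean (positive or negative deviation), a unit can be classified as follows. The function lisa_quadrant is our own illustration.

```python
def lisa_quadrant(z, lag_z):
    """Classify a unit from its deviation z and its neighbors' mean deviation lag_z."""
    if z > 0 and lag_z > 0:
        return 1  # HH: hotSpot
    if z <= 0 and lag_z > 0:
        return 2  # LH: coldOutlier
    if z <= 0 and lag_z <= 0:
        return 3  # LL: coldSpot
    return 4      # HL: hotOutlier

# Hypothetical (z, lag_z) pairs, one per quadrant.
examples = [(1.2, 0.8), (-0.5, 0.9), (-1.0, -0.3), (0.7, -0.6)]
quadrants = [lisa_quadrant(z, lag_z) for z, lag_z in examples]
print(quadrants)
```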

It is also possible that a polygon shows no significant local correlation. Let's see those values:

In [394]:
# The scatterplot with local info

from esda.moran import Moran_Local

# calculate Moran_Local and plot
lisacovid19 = Moran_Local(y=covid19_vulnerables_Alarm_map['year_2022_qt'], w=w_knn,seed=2022)
fig, ax = moran_scatterplot(lisacovid19,p=0.05)
ax.set_xlabel('Covid19_alarma_share')
ax.set_ylabel('SpatialLag_Covid19_alarma_share');
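[Figure: local Moran scatterplot with significant points (p < 0.05) highlighted; x-axis: Covid19_alarma_share, y-axis: SpatialLag_Covid19_alarma_share]
In [395]: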
from splot.esda import plot_local_autocorrelation
plot_local_autocorrelation(lisacovid19, covid19_vulnerables_Alarm_map,'year_2022_qt')
plt.show()
[Figure: local autocorrelation panel for year_2022_qt (Moran scatterplot, LISA cluster map, choropleth)]

Let's add that information to the GeoDataFrame:

In [397]:
# quadrant
lisacovid19.q
Out[397]:
array([2, 4, 3, 3, 3, 4, 3, 2, 1, 1, 1, 1, 1, 2, 1, 1, 2, 2, 4, 1, 1, 4,
       1, 4, 1, 2, 1, 3, 1, 4, 4, 1, 2, 4, 1, 2, 2, 1, 1, 2, 1, 2, 2, 1,
       3, 1, 1, 1, 4, 1, 2, 1, 1, 2, 2, 4, 1, 1, 1, 2, 1, 2, 4, 4, 4, 1,
       3, 2, 3, 2, 2, 2, 1, 4, 3, 3, 2, 4, 4, 4, 4, 4, 2, 3, 3, 4, 3, 3,
       2, 1, 2, 4, 4, 2, 1, 3, 4, 4, 1, 3, 4, 3, 3, 3, 3, 2, 3, 3, 4, 3,
       3, 4, 3, 3, 4, 3, 4, 4, 3, 3, 3, 3, 3, 3, 3, 4, 4, 3, 2, 1, 4, 1,
       4, 2, 4, 2, 1, 1, 1, 4, 1, 3, 1, 1, 3, 3, 3, 1, 3, 2, 2, 4, 3, 1,
       1, 4, 2, 3, 4, 2, 2, 4, 3, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3, 2, 1, 4,
       3, 3, 3, 2, 4, 4, 4, 2, 4, 2, 3, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2,
       1, 1, 3, 4, 3, 3, 4, 4, 2, 4, 3, 3, 2, 4, 2, 1, 2, 4, 4, 4, 4, 3,
       3])
In [398]:
# significance (pseudo p-value of each local statistic)
lisacovid19.p_sim
Out[398]:
array([0.218, 0.061, 0.464, 0.216, 0.446, 0.27 , 0.314, 0.042, 0.099,
       0.454, 0.28 , 0.378, 0.368, 0.181, 0.287, 0.271, 0.122, 0.135,
       0.395, 0.115, 0.202, 0.355, 0.14 , 0.467, 0.458, 0.022, 0.128,
       0.451, 0.075, 0.094, 0.404, 0.029, 0.366, 0.263, 0.255, 0.366,
       0.221, 0.235, 0.074, 0.4  , 0.237, 0.199, 0.252, 0.117, 0.428,
       0.272, 0.15 , 0.079, 0.449, 0.023, 0.293, 0.239, 0.068, 0.264,
       0.104, 0.161, 0.009, 0.013, 0.095, 0.024, 0.201, 0.205, 0.145,
       0.316, 0.404, 0.324, 0.346, 0.452, 0.422, 0.435, 0.318, 0.22 ,
       0.139, 0.384, 0.08 , 0.364, 0.396, 0.186, 0.229, 0.304, 0.265,
       0.359, 0.174, 0.319, 0.47 , 0.338, 0.388, 0.463, 0.075, 0.237,
       0.256, 0.487, 0.419, 0.295, 0.377, 0.179, 0.023, 0.035, 0.446,
       0.179, 0.01 , 0.267, 0.044, 0.417, 0.491, 0.496, 0.033, 0.295,
       0.021, 0.012, 0.021, 0.023, 0.023, 0.025, 0.01 , 0.029, 0.023,
       0.006, 0.023, 0.022, 0.022, 0.03 , 0.022, 0.022, 0.022, 0.024,
       0.015, 0.367, 0.044, 0.275, 0.486, 0.222, 0.412, 0.115, 0.441,
       0.352, 0.036, 0.447, 0.086, 0.168, 0.183, 0.24 , 0.375, 0.086,
       0.1  , 0.184, 0.223, 0.328, 0.379, 0.164, 0.473, 0.37 , 0.27 ,
       0.037, 0.377, 0.241, 0.474, 0.478, 0.395, 0.234, 0.407, 0.286,
       0.024, 0.068, 0.037, 0.136, 0.078, 0.097, 0.11 , 0.034, 0.446,
       0.301, 0.046, 0.424, 0.188, 0.353, 0.321, 0.233, 0.477, 0.459,
       0.429, 0.258, 0.178, 0.481, 0.494, 0.444, 0.492, 0.001, 0.017,
       0.011, 0.002, 0.06 , 0.009, 0.138, 0.286, 0.003, 0.017, 0.001,
       0.062, 0.003, 0.375, 0.001, 0.433, 0.329, 0.101, 0.188, 0.241,
       0.219, 0.265, 0.009, 0.138, 0.494, 0.135, 0.209, 0.388, 0.474,
       0.477, 0.343, 0.064, 0.013, 0.235])
In [399]:
# quadrant: 1 HH,  2 LH,  3 LL,  4 HL
pd.Series(lisacovid19.q).value_counts()
Out[399]:
3    63
1    57
4    55
2    46
Name: count, dtype: int64

The quadrants in lisacovid19.q cannot be used right away: we first need to check whether each local spatial correlation is significant, and set the non-significant ones to 0:

In [401]:
covid19_vulnerables_Alarm_map['Covid19_quadrant']=[l if p <0.05 else 0 for l,p in zip(lisacovid19.q,lisacovid19.p_sim)  ]
covid19_vulnerables_Alarm_map['Covid19_quadrant'].value_counts()
Out[401]:
Covid19_quadrant
0    171
3     20
1     12
4     11
2      7
Name: count, dtype: int64

Now, we recode:

In [403]:
labels = [ '0 no_sig', '1 hotSpot', '2 coldOutlier', '3 coldSpot', '4 hotOutlier']

covid19_vulnerables_Alarm_map['Covid19_quadrant_names']=[labels[i] for i in covid19_vulnerables_Alarm_map['Covid19_quadrant']]

covid19_vulnerables_Alarm_map['Covid19_quadrant_names'].value_counts()
Out[403]:
Covid19_quadrant_names
0 no_sig         171
3 coldSpot        20
1 hotSpot         12
4 hotOutlier      11
2 coldOutlier      7
Name: count, dtype: int64

Let's replot:

In [405]:
from matplotlib import colors

# one color per category, in label order: no_sig, hotSpot, coldOutlier, coldSpot, hotOutlier
myColMap = colors.ListedColormap([ 'ghostwhite', 'red', 'green', 'black','orange'])

f, ax = plt.subplots(1, figsize=(12,12))

plt.title('Spots and Outliers')

covid19_vulnerables_Alarm_map.plot(column='Covid19_quadrant_names',
                categorical=True,
                cmap=myColMap,
                linewidth=0.1,
                edgecolor='white',
                legend=True,
                legend_kwds={'loc': 'center left',
                             'bbox_to_anchor': (0.7, 0.6)},
                ax=ax)
# Remove axis
ax.set_axis_off()
# Display the map
plt.show()
[Figure: map "Spots and Outliers", provinces colored by LISA category]
In [406]:
covid19_vulnerables_Alarm_map.explore("Covid19_quadrant_names", categorical=True,tooltip='location',cmap=myColMap)
Out[406]:
[Interactive map not rendered in this export]
In [407]:
import folium

map1=covid19_vulnerables_Alarm_map[covid19_vulnerables_Alarm_map.Covid19_quadrant_names=='1 hotSpot']
map2=covid19_vulnerables_Alarm_map[covid19_vulnerables_Alarm_map.Covid19_quadrant_names=='2 coldOutlier']
map3=covid19_vulnerables_Alarm_map[covid19_vulnerables_Alarm_map.Covid19_quadrant_names=='3 coldSpot']
map4=covid19_vulnerables_Alarm_map[covid19_vulnerables_Alarm_map.Covid19_quadrant_names=='4 hotOutlier']

m = map1.explore(
    color="red",
    tooltip=False,  # hide tooltip
    popup=["location"],  # (on-click)
    name="hotSpot"  # name of the layer in the map
)

map2.explore(
    m=m, # notice
    color="green",
    tooltip=False,
    popup=["location"],
    name="coldOutlier"
)

map3.explore(
    m=m,
    color="black",
    tooltip=False,
    popup=["location"],
    name="coldSpot",
)

map4.explore(
    m=m,
    color="orange",
    tooltip=False,
    popup=["location"],
    name="hotOutlier",
)

folium.TileLayer("CartoDB positron", show=False).add_to(m)  # use folium to add alternative tiles
folium.LayerControl(collapsed=True).add_to(m)  # use folium to add layer control

m  # show map
Out[407]:
[Interactive map not rendered in this export]

Bivariate LISA

In [409]:
from esda.moran import Moran_BV

# global bivariate Moran: year2021 values (x) against the spatial lag of year2022 (y)
mbi = Moran_BV(covid19_vulnerables_Alarm_map['year2021'],  covid19_vulnerables_Alarm_map['year2022'],  w_queen)
mbi.I,mbi.p_sim
Out[409]:
(0.07609880293390373, 0.011)
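
The bivariate statistic relates one variable in each unit to the neighborhood average of a second variable. The sketch below is a minimal version on hypothetical data. It assumes both variables are z-scored with the sample standard deviation and that weights are row-standardized. The exact scaling used by esda may differ, so treat it as an illustration of the idea only.

```python
import statistics

def standardize(values):
    """z-scores using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def bivariate_moran(x, y, neighbors):
    """Correlation between x and the spatial lag of y (row-standardized weights)."""
    zx, zy = standardize(x), standardize(y)
    n = len(x)
    lag_zy = [sum(zy[j] for j in neighbors[i]) / len(neighbors[i]) for i in range(n)]
    return sum(a * b for a, b in zip(zx, lag_zy)) / (n - 1)

# Hypothetical chain of four units.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
x = [1.0, 2.0, 3.0, 4.0]
same_pattern = [2.0, 4.0, 6.0, 8.0]      # y moves together with x
opposite_pattern = [8.0, 6.0, 4.0, 2.0]  # y moves against x

i_same = bivariate_moran(x, same_pattern, neighbors)
i_opposite = bivariate_moran(x, opposite_pattern, neighbors)
print(i_same, i_opposite)
```

When the second variable is a linear rescaling of the first, the result matches the univariate Moran's I of the first variable.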
In [410]:
# The scatterplot with local info
from esda.moran import Moran_Local_BV

# calculate the bivariate local Moran and plot:
# x holds the 2022 values, y holds the 2021 values whose spatial lag is used
lisacovid19_bv = Moran_Local_BV(y=covid19_vulnerables_Alarm_map['year2021'],
                               x=covid19_vulnerables_Alarm_map['year2022'],
                               w=w_queen)

fig, ax = moran_scatterplot(lisacovid19_bv, p=0.05,aspect_equal=True)

ax.set_xlabel('Covid19_2022')
ax.set_ylabel('SpatialLag_Covid19_2021')
plt.show()
[Figure: Moran scatterplot, Covid19_2022 (x) vs. SpatialLag_Covid19_2021 (y)]
In [411]:
# Keep the quadrant code only where the local statistic is significant (p < 0.05); otherwise use 0
covid19_vulnerables_Alarm_map['Covid19_quadrant_21_22'] = [q if p < 0.05 else 0 for q, p in zip(lisacovid19_bv.q, lisacovid19_bv.p_sim)]

# Labels indexed by quadrant code
labels = ['0 no_sig', '1 hotSpot', '2 coldOutlier', '3 coldSpot', '4 hotOutlier']

covid19_vulnerables_Alarm_map['Covid19_quadrant_21_22_names'] = [labels[i] for i in covid19_vulnerables_Alarm_map['Covid19_quadrant_21_22']]
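The significance filter in the cell above can be tried on toy values. This is a small sketch with made-up quadrant codes and p-values; `label_quadrants` is a hypothetical helper, not part of the notebook. Only codes with p below the threshold are kept, and everything else collapses to 0 (`no_sig`).

```python
QUADRANT_LABELS = ['0 no_sig', '1 hotSpot', '2 coldOutlier', '3 coldSpot', '4 hotOutlier']


def label_quadrants(quadrants, p_values, alpha=0.05):
    """Zero out non-significant quadrant codes, then map each code to its label."""
    codes = [q if p < alpha else 0 for q, p in zip(quadrants, p_values)]
    names = [QUADRANT_LABELS[c] for c in codes]
    return codes, names


# Made-up inputs: quadrant codes (1-4) and their pseudo p-values
toy_q = [1, 2, 3, 4, 1]
toy_p = [0.01, 0.20, 0.04, 0.049, 0.05]

codes, names = label_quadrants(toy_q, toy_p)
print(codes)
print(names)
```

Because the comparison is strict, a p-value of exactly 0.05 is treated as not significant.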
In [412]:
# see new columns
covid19_vulnerables_Alarm_map
Out[412]:
OBJECTID DEPARTAMEN PROVINCIA_x geometry location PROVINCIA_y year2020 year2021 year2022 year2023 year2024 flag year_2022_qt Covid19_quadrant Covid19_quadrant_names Covid19_quadrant_21_22 Covid19_quadrant_21_22_names
0 1.0 AMAZONAS CHACHAPOYAS POLYGON ((-77.72614 -5.94354, -77.72486 -5.943... AMAZONAS+CHACHAPOYAS CHACHAPOYAS 0.273486 0.321394 0.268201 0.417476 0.440860 both -0.932398 0 0 no_sig 0 0 no_sig
1 2.0 AMAZONAS BAGUA POLYGON ((-78.61909 -4.51001, -78.61802 -4.510... AMAZONAS+BAGUA BAGUA 0.370885 0.391144 0.339266 0.533333 0.458333 both 0.348756 0 0 no_sig 0 0 no_sig
2 3.0 AMAZONAS BONGARA POLYGON ((-77.72759 -5.14030, -77.72361 -5.140... AMAZONAS+BONGARA BONGARA 0.348485 0.363825 0.305233 0.500000 0.600000 both -0.374322 0 0 no_sig 0 0 no_sig
3 4.0 AMAZONAS CONDORCANQUI POLYGON ((-77.81399 -2.99278, -77.81483 -2.995... AMAZONAS+CONDORCANQUI CONDORCANQUI 0.238017 0.339367 0.205714 0.000000 0.000000 both -1.897272 0 0 no_sig 0 0 no_sig
4 5.0 AMAZONAS LUYA POLYGON ((-78.13023 -5.90370, -78.13011 -5.904... AMAZONAS+LUYA LUYA 0.383117 0.368317 0.309783 0.346154 0.400000 both -0.276579 0 0 no_sig 0 0 no_sig
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
216 192.0 TUMBES ZARUMILLA POLYGON ((-80.28521 -3.41276, -80.28406 -3.412... TUMBES+ZARUMILLA ZARUMILLA 0.354447 0.305589 0.336237 0.235294 0.444444 both 0.308264 0 0 no_sig 0 0 no_sig
217 193.0 UCAYALI CORONEL PORTILLO POLYGON ((-74.47145 -7.27617, -74.47052 -7.277... UCAYALI+CORONELPORTILLO CORONEL PORTILLO 0.387321 0.342441 0.328023 0.404255 0.333333 both 0.155157 0 0 no_sig 0 0 no_sig
218 194.0 UCAYALI ATALAYA POLYGON ((-73.18146 -9.41174, -73.13475 -9.411... UCAYALI+ATALAYA ATALAYA 0.325243 0.241379 0.344828 0.000000 0.000000 both 0.460257 0 0 no_sig 4 4 hotOutlier
219 195.0 UCAYALI PADRE ABAD POLYGON ((-75.43663 -8.22999, -75.43651 -8.230... UCAYALI+PADREABAD PADRE ABAD 0.309686 0.332174 0.279487 0.071429 0.000000 both -0.788427 3 3 coldSpot 0 0 no_sig
220 196.0 UCAYALI PURUS POLYGON ((-70.61380 -9.87339, -70.62140 -9.878... UCAYALI+PURUS PURUS 0.224599 0.300000 0.172414 0.000000 0.000000 both -2.330804 0 0 no_sig 3 3 coldSpot

221 rows × 17 columns

In [413]:
from matplotlib import colors

# One color per quadrant label, in label order: no_sig, hotSpot, coldOutlier, coldSpot, hotOutlier
myColMap = colors.ListedColormap(['ghostwhite', 'red', 'green', 'black', 'orange'])

f, ax = plt.subplots(1, figsize=(12, 12))

plt.title('Spots and Outliers')

covid19_vulnerables_Alarm_map.plot(column='Covid19_quadrant_21_22_names',
                categorical=True,
                cmap=myColMap,
                linewidth=0.1,
                edgecolor='white',
                legend=True,
                legend_kwds={'loc': 'center left',
                             'bbox_to_anchor': (0.7, 0.6)},
                ax=ax)
# Remove axis
ax.set_axis_off()
# Display the map
plt.show()
[Figure: map titled 'Spots and Outliers', colored by Covid19_quadrant_21_22_names]
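A fixed five-color map lines up with the labels only when all five categories occur in the plotted column; with fewer categories, the colors may shift to the wrong label. A minimal sketch of one way to guard against that, assuming categories are drawn in sorted label order (`QUADRANT_COLORS` and `palette_for` are hypothetical names, not part of the notebook):

```python
QUADRANT_COLORS = {
    '0 no_sig': 'ghostwhite',
    '1 hotSpot': 'red',
    '2 coldOutlier': 'green',
    '3 coldSpot': 'black',
    '4 hotOutlier': 'orange',
}


def palette_for(present_labels, color_by_label=QUADRANT_COLORS):
    """Return the colors of the labels that actually occur, in sorted label order."""
    return [color_by_label[label] for label in sorted(set(present_labels))]


# Made-up column in which two of the five categories are absent
toy_labels = ['0 no_sig', '3 coldSpot', '0 no_sig', '4 hotOutlier']
toy_palette = palette_for(toy_labels)
print(toy_palette)
```

The resulting list could then be passed to `colors.ListedColormap` in place of the fixed five-color list.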
In [414]:
# Map of the bivariate spots and outliers

from splot.esda import lisa_cluster
f, ax = plt.subplots(1, figsize=(12, 12))
plt.title('Spots and Outliers')
# lisacovid19_bv is the Moran_Local_BV result computed above
fig = lisa_cluster(lisacovid19_bv,
                   covid19_vulnerables_Alarm_map, ax=ax,
                   legend_kwds={'loc': 'center left',
                                'bbox_to_anchor': (0.7, 0.6)})
[Figure: LISA cluster map titled 'Spots and Outliers']

Use GitHub to store, publish, and present your work

Link to the Assignment 3 repository: https://github.com/luispachecoc/covid_19

Link to GitHub Pages: https://luispachecoc.github.io/covid_19/